An Introduction to Effective BLASTing
Abstract
Sequence-alignments in general and BLAST searches in particular have become a ubiquitous part of molecular biology. Despite BLAST's popularity, its vast array of tools and parameter choices can overwhelm the user. Yet accepting the default parameters can greatly reduce search sensitivity and accuracy. This review focuses on the major parameters for BLASTN and BLASTP searches, and discusses both their default values and how they can be tweaked to enhance query results.

Introduction

Computational biology can be defined as the use of quantitative, mathematical models to study biological questions (1). This covers a broad spectrum of research questions, ranging from phylogenetic studies (2) to the discovery of new genes (3) and splice-variants (4), and from the prediction of transcription-factor binding sites (5) to the integration of large transcriptomic and genomic datasets (6). With such a diverse set of questions, it might be surprising that a common "battery" of computational and mathematical techniques is used to solve them. In a way, this is akin to standard molecular techniques like PCR or Western blotting, which are broadly applied to answer many distinct research questions from a molecular perspective. This "toolbox" of computational techniques includes pattern-recognition techniques like clustering (7), sequence-modeling procedures like Hidden Markov Models (8), and a wide range of statistical and mathematical procedures (9-11).

Perhaps the most ubiquitous computational technique, however, is sequence-alignment. The most common sequence-alignment program, NCBI BLAST, is used tens of thousands of times each day (12). This review aims to introduce the reader to the many parameters available for tuning and effectively using NCBI BLAST. Following a brief overview of BLAST, the two canonical forms of BLAST are introduced. The parameters for each form are detailed, and recommendations on parameter selection are given.
The Problem of Sequence Alignments and the BLAST Solution

Sequence-alignments are a core element of the computational biology toolbox and are extensively used to study the primary structure of proteins and nucleic acids. Fundamentally, a sequence-alignment is a way of comparing sequences to one another. Thus, sequence-alignments can find use whenever sequences are being studied. Typical uses of sequence-alignments include the identification of cDNA clones in a library (13), the discovery of splice-variants in large sequence databases (14), functional characterization of uncharacterized genes (15), and evolutionary studies of specific proteins or genes (16). Regardless of the application, sequence-alignments are a way of determining "how similar" sequences are to one another. Regions that are similar can be overlapped, or "aligned". The graphical display of this alignment gives the technique its name. Sequences that are very similar will show "strong" alignments, meaning that few mismatches or gaps exist. Algorithms exist to align pairs of sequences (pair-wise sequence alignment) as well as to align larger numbers of sequences (multiple sequence alignment). This review focuses exclusively on pair-wise sequence alignments.

Effective BLASTing: TIPS & TECHNIQUES, Hypothesis 27

Because of its similarity to classical computer science problems, pair-wise sequence-alignment has been extensively studied. An optimal algorithm exists to align two sequences to one another, based on a computational technique called dynamic programming. Unfortunately, these optimal alignments, often called Smith/Waterman alignments, are extremely slow to compute. Even on very fast computers, comprehensive database searches using optimal Smith/Waterman alignments can run prohibitively slowly (17). This is where BLAST comes in. The Basic Local Alignment Search Tool (BLAST) is based on a heuristic approximation used to speed up Smith/Waterman alignments. By assuming that the best local alignment will contain a small, exact match (Figure 1), the execution time of large database searches can be dramatically reduced (17).
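The contrast between the two approaches can be made concrete with a small sketch. The Python code below is an illustration only, not the NCBI BLAST implementation: `smith_waterman_score` fills a dynamic-programming table to find the best local alignment score, and `find_seeds` locates the short exact matches that a BLAST-style search would use as starting points. The function names, the scoring values (match +1, mismatch -1, gap -2) and the word size of 4 are all assumptions chosen for readability; they are not BLAST defaults.

```python
def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local alignment score of a and b by dynamic programming.

    Scoring values are illustrative assumptions (linear gap penalty).
    Running time grows with len(a) * len(b), which is why an exhaustive
    database search with this method is slow.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment here
                prev[j - 1] + step, # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


def find_seeds(query, subject, word_size=4):
    """Return (query_pos, subject_pos) pairs where an exact word matches.

    word_size=4 is an arbitrary illustrative value, not a BLAST default.
    """
    # Index every word of the query by its start position.
    index = {}
    for i in range(len(query) - word_size + 1):
        index.setdefault(query[i:i + word_size], []).append(i)

    # Scan the subject once and look each word up in the index.
    seeds = []
    for j in range(len(subject) - word_size + 1):
        for i in index.get(subject[j:j + word_size], []):
            seeds.append((i, j))
    return seeds


if __name__ == "__main__":
    query, subject = "ACGTAC", "TTACGTTT"
    print(smith_waterman_score(query, subject))
    print(find_seeds(query, subject))
```

The dynamic-programming function visits every cell of the table for every database sequence, whereas the seed search needs only one pass over the subject and a dictionary lookup per word. A heuristic search can then restrict the expensive alignment work to the neighbourhood of each seed, at the cost of missing alignments that contain no exact word match.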